A Probabilistic Lexical Approach to Textual Entailment

Authors

  • Oren Glickman
  • Ido Dagan
  • Moshe Koppel
Abstract

The textual entailment problem is to determine if a given text entails a given hypothesis. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed from the text. This problem is recast as one of text categorization in which the classes are the vocabulary words. We make novel use of Naïve Bayes to model the problem in an entirely unsupervised fashion. Empirical tests suggest that the method is effective and compares favorably with state-of-the-art heuristic scoring approaches.

Many Natural Language Processing (NLP) applications need to recognize when the meaning of one text can be expressed by, or inferred from, another text. Information Retrieval (IR), Question Answering (QA), Information Extraction (IE) and text summarization are examples of applications that need to assess such semantic overlap between text segments. Textual Entailment Recognition has recently been proposed as an application-independent task to capture such semantic inferences and variability [Dagan et al., 2005]. A text t textually entails a hypothesis h if t implies the truth of h. Textual entailment captures generically a broad range of inferences that are relevant for multiple applications.

For example, a QA system has to identify texts that entail the expected answer. Given the question "Where was Harry Reasoner born?", a text that includes the sentence "Harry Reasoner's birthplace is Iowa" entails the expected answer form "Harry Reasoner was born in Iowa." In many cases, though, entailment inference is uncertain and has a probabilistic nature. For example, a text that includes the sentence "Harry Reasoner is returning to his Iowa hometown to get married." does not deterministically entail the above answer form. Yet, it is clear that it does add substantial information about the correctness of the hypothesized assertion.
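The recasting described in the abstract, in which each vocabulary word acts as a class, might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus, the function names `train_word_classifier` and `prob_word_entailed`, the use of "the word occurs in the document" as an unsupervised class label, and the add-one smoothing are all hypothetical choices made for the example.

```python
import math
from collections import Counter

def train_word_classifier(corpus, target):
    """Fit a binary Naive Bayes model for the class 'target occurs in the document'.

    Assumption: a document that contains `target` is treated as a positive
    example, so no manual labels are needed. The target word itself is
    excluded from the features.
    """
    pos = [d for d in corpus if target in d]
    neg = [d for d in corpus if target not in d]
    vocab = {w for d in corpus for w in d if w != target}

    def word_probs(docs):
        counts = Counter(w for d in docs for w in d if w != target)
        total = sum(counts.values())
        # Add-one (Laplace) smoothing over the feature vocabulary.
        return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}

    prior = (len(pos) + 1) / (len(corpus) + 2)  # smoothed class prior
    return prior, word_probs(pos), word_probs(neg)

def prob_word_entailed(text, model):
    """Posterior probability of the positive class given the words of `text`."""
    prior, p_pos, p_neg = model
    log_pos, log_neg = math.log(prior), math.log(1.0 - prior)
    for w in text:
        if w in p_pos:  # words unseen in the corpus are ignored
            log_pos += math.log(p_pos[w])
            log_neg += math.log(p_neg[w])
    # Normalize in log space to avoid underflow.
    m = max(log_pos, log_neg)
    e_pos, e_neg = math.exp(log_pos - m), math.exp(log_neg - m)
    return e_pos / (e_pos + e_neg)

# Toy corpus of tokenized documents (hypothetical data).
corpus = [
    ["reasoner", "born", "iowa"],
    ["reasoner", "birthplace", "iowa"],
    ["hometown", "born", "iowa"],
    ["senate", "vote", "budget"],
    ["budget", "vote", "tax"],
]
model = train_word_classifier(corpus, "born")
p_related = prob_word_entailed(["birthplace", "iowa"], model)
p_unrelated = prob_word_entailed(["budget", "tax"], model)
print(p_related, p_unrelated)
```

On this toy data, a text about a birthplace in Iowa receives a higher score for the hypothesis word "born" than a text about budgets and taxes. A score for a whole hypothesis could then be formed by combining the per-word scores, for instance as a product, though that combination is likewise an assumption of this sketch.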
A Probabilistic Setting

We propose a general generative probabilistic setting for textual entailment. We assume that a language source generates texts within the context of some state of affairs. Thus, texts are generated along with hidden truth assignments to hypotheses. We define two types of events over the corresponding probability space: I) For a hypothesis h, we denote as Tr_h the random variable whose value is the truth value assigned to h in the world of the generated text. Correspondingly, Tr_h = 1 is the event of h being assigned a truth value of 1 (True). II) For …

Similar resources

A Probabilistic Setting and Lexical Cooccurrence Model for Textual Entailment

This paper proposes a general probabilistic setting that formalizes a probabilistic notion of textual entailment. We further describe a particular preliminary model for lexical-level entailment, based on document cooccurrence probabilities, which follows the general setting. The model was evaluated on two application independent datasets, suggesting the relevance of such probabilistic approache...

A Lexical Alignment Model for Probabilistic Textual Entailment

This paper describes the Bar-Ilan system participating in the Recognising Textual Entailment Challenge. The paper proposes first a general probabilistic setting that formalizes the notion of textual entailment. We then describe a concrete alignment-based model for lexical entailment, which utilizes web co-occurrence statistics in a bag of words representation. Finally, we report the results of ...

A Probabilistic Classification Approach for Lexical Textual Entailment

The textual entailment task – determining if a given text entails a given hypothesis – provides an abstraction of applied semantic inference. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed from the text. This problem is recast as one of ...

Web Based Probabilistic Textual Entailment

This paper proposes a general probabilistic setting that formalizes the notion of textual entailment. In addition we describe a concrete model for lexical entailment based on web co-occurrence statistics in a bag of words representation.

Towards a Probabilistic Model for Lexical Entailment

While modeling entailment at the lexical-level is a prominent task, addressed by most textual entailment systems, it has been approached mostly by heuristic methods, neglecting some of its important aspects. We present a probabilistic approach for this task which covers aspects such as differentiating various resources by their reliability levels, considering the length of the entailed sentence...

Publication date: 2005